-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-32080][SPARK-31998][SQL] Simplify ArrowColumnVector ListArray accessor #28915
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-32080][SPARK-31998][SQL] Simplify ArrowColumnVector ListArray accessor #28915
Conversation
|
This also avoid having to deal with an API change in the ArrowBuf class for Arrow 1.0.0 and will allow Arrow-Spark integration tests to run again in preparation for the 1.0.0 release, see apache/arrow#6316. |
viirya
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this also covers SPARK-31998?
| int index = rowId * ListVector.OFFSET_WIDTH; | ||
| int start = offsets.getInt(index); | ||
| int end = offsets.getInt(index + ListVector.OFFSET_WIDTH); | ||
| int start = accessor.getElementStartIndex(rowId); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @viirya , I wasn't aware SPARK-31998 was made, but yes that was the motivation for this fix.
HyukjinKwon
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
|
Test build #124451 has finished for PR 28915 at commit
|
|
retest this please |
|
Test build #124470 has finished for PR 28915 at commit
|
|
Merged to master and branch-3.0. |
…accessor ### What changes were proposed in this pull request? This change simplifies the ArrowColumnVector ListArray accessor to use provided Arrow APIs available in v0.15.0 to calculate element indices. ### Why are the changes needed? This simplifies the code by avoiding manual calculations on the Arrow offset buffer and makes use of more stable APIs. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Existing tests Closes #28915 from BryanCutler/arrow-simplify-ArrowColumnVector-ListArray-SPARK-32080. Authored-by: Bryan Cutler <[email protected]> Signed-off-by: HyukjinKwon <[email protected]> (cherry picked from commit df04107) Signed-off-by: HyukjinKwon <[email protected]>
|
Thanks @HyukjinKwon and @viirya ! |
What changes were proposed in this pull request?
This change simplifies the ArrowColumnVector ListArray accessor to use provided Arrow APIs available in v0.15.0 to calculate element indices.
Why are the changes needed?
This simplifies the code by avoiding manual calculations on the Arrow offset buffer and makes use of more stable APIs.
Does this PR introduce any user-facing change?
No
How was this patch tested?
Existing tests